{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "8442727e",
   "metadata": {},
   "source": [
    "## Lesson 4: In-place instruction rewriting"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "003eb910",
   "metadata": {},
   "source": [
    "**Objectives**: use OFRAK's Ghidra backend; use more filtering capabilities to find specific complex blocks and instructions; assemble an instruction using Keystone; rewrite an instruction in-place"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e2ab0236",
   "metadata": {},
   "source": [
    "In this section, we'll rewrite the `ret` instruction so that the binary loops back to its beginning instead of returning and exiting at the end of the main function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "6bf1aa8b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Using OFRAK Community License.\n"
     ]
    }
   ],
   "source": [
    "from ofrak import OFRAK\n",
    "from ofrak_tutorial.helper_functions import create_hello_world_binary\n",
    "\n",
    "create_hello_world_binary()\n",
    "\n",
    "ofrak = OFRAK()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9f976b7d",
   "metadata": {},
   "source": [
    "This time, we want to analyze the binary fully, down to the instruction level. We'll do that by using OFRAK's Ghidra backend. Let's create a more powerful OFRAK context:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "5c0bcf74",
   "metadata": {},
   "outputs": [],
   "source": [
    "import ofrak_ghidra\n",
    "\n",
    "ofrak.discover(ofrak_ghidra)\n",
    "\n",
    "binary_analysis_context = await ofrak.create_ofrak_context()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f416f82",
   "metadata": {},
   "source": [
    "Let's unpack recursively again, now that Ghidra is loaded into OFRAK."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "041ab1f8",
   "metadata": {
    "tags": [
     "nbval-ignore-output"
    ]
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "openjdk version \"11.0.23\" 2024-04-16\n",
      "OpenJDK Runtime Environment (build 11.0.23+9-post-Debian-1deb11u1)\n",
      "OpenJDK 64-Bit Server VM (build 11.0.23+9-post-Debian-1deb11u1, mixed mode)\n",
      "openjdk version \"11.0.23\" 2024-04-16\n",
      "OpenJDK Runtime Environment (build 11.0.23+9-post-Debian-1deb11u1)\n",
      "OpenJDK 64-Bit Server VM (build 11.0.23+9-post-Debian-1deb11u1, mixed mode)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "components run: [b'CodeRegionUnpacker', b'ComplexBlockUnpacker', b'DecompilationAnalysisIdentifier', b'ElfDynamicSectionUnpacker', b'ElfPointerArraySectionUnpacker', b'ElfRelaUnpacker', b'ElfSymbolUnpacker', b'ElfUnpacker', b'GhidraAnalysisIdentifier', b'GhidraBasicBlockUnpacker', b'LinkableSymbolIdentifier', b'MagicIdentifier']\n",
      "309 resources created\n",
      "310 resources modified\n"
     ]
    }
   ],
   "source": [
    "root_resource = await binary_analysis_context.create_root_resource_from_file(\"hello_world\")\n",
    "ghidra_unpack_result = await root_resource.unpack_recursively()\n",
    "print(f\"components run: {sorted(ghidra_unpack_result.components_run)}\")\n",
    "print(f\"{len(ghidra_unpack_result.resources_created)} resources created\")\n",
    "print(f\"{len(ghidra_unpack_result.resources_modified)} resources modified\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e71a61c5",
   "metadata": {},
   "source": [
    "Do we have instructions this time?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "7f007f8b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from ofrak.core import Instruction\n",
    "from ofrak_tutorial.helper_functions import get_descendants_tags\n",
    "\n",
    "assert Instruction in await get_descendants_tags(root_resource)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "457523f5",
   "metadata": {},
   "source": [
    "Good.\n",
    "\n",
    "**Complex Blocks** in OFRAK are sets of basic blocks representing a logical unit of code. In particular, all functions are complex blocks.\n",
    "\n",
    "How do we get the complex block corresponding to \"main\"?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "61c9f653",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ComplexBlock(virtual_address=4198694, size=26, name='main')\n"
     ]
    }
   ],
   "source": [
    "from ofrak.core import ComplexBlock\n",
    "from ofrak import ResourceFilter, ResourceAttributeValueFilter\n",
    "\n",
    "\n",
    "async def get_main_complex_block(root_resource):\n",
    "    return await root_resource.get_only_descendant_as_view(\n",
    "        v_type=ComplexBlock,\n",
    "        r_filter=ResourceFilter(\n",
    "            attribute_filters=(ResourceAttributeValueFilter(ComplexBlock.Symbol, \"main\"),)\n",
    "        ),\n",
    "    )\n",
    "\n",
    "\n",
    "main_cb = await get_main_complex_block(root_resource)\n",
    "print(main_cb)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "46fc5fee",
   "metadata": {},
   "source": [
    "`resource.get_only_descendant_as_view` is a shortcut to:\n",
    "- get the only descendant matching the filter `r_filter` (asserting there is one and only one such descendant);\n",
    "- get it as a resource view of `v_type` (in this case, `ComplexBlock`)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1c2467b5",
   "metadata": {},
   "source": [
    "Getting the only `ret` instruction in this complex block is a similar process:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "1c75a3fd",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Instruction(virtual_address=4198719, size=1, mnemonic='ret', operands='', mode=<InstructionSetMode.NONE: 0>)\n"
     ]
    }
   ],
   "source": [
    "async def get_complex_block_ret_instruction(complex_block):\n",
    "    return await main_cb.resource.get_only_descendant_as_view(\n",
    "        v_type=Instruction,\n",
    "        r_filter=ResourceFilter(\n",
    "            attribute_filters=(ResourceAttributeValueFilter(Instruction.Mnemonic, \"ret\"),)\n",
    "        ),\n",
    "    )\n",
    "\n",
    "\n",
    "ret_instruction = await get_complex_block_ret_instruction(main_cb)\n",
    "print(ret_instruction)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ccfa5c75",
   "metadata": {},
   "source": [
    "Let's assemble our new instruction that loops to the start of the \"main\" complex block:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "ff76f15a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "b'\\xeb\\xe5'"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from ofrak.core import ProgramAttributes\n",
    "from ofrak.service.assembler.assembler_service_keystone import KeystoneAssemblerService\n",
    "\n",
    "\n",
    "async def get_looping_instruction(main_cb, ret_instruction, program_attributes) -> bytes:\n",
    "    assembler_service = KeystoneAssemblerService()\n",
    "    return await assembler_service.assemble(\n",
    "        assembly=f\"jmp {main_cb.virtual_address}\",\n",
    "        vm_addr=ret_instruction.virtual_address,\n",
    "        program_attributes=program_attributes,\n",
    "    )\n",
    "\n",
    "\n",
    "program_attributes = await root_resource.analyze(ProgramAttributes)\n",
    "\n",
    "await get_looping_instruction(main_cb, ret_instruction, program_attributes)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8b90964f",
   "metadata": {},
   "source": [
    "Looks good. Let's put this all together:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "ab7e0a74",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello, World!\n",
      "Hello, World!\n",
      "Hello, World!\n",
      "Hello, World!\n",
      "Hello, World!\n",
      "...\n"
     ]
    }
   ],
   "source": [
    "import subprocess\n",
    "\n",
    "from ofrak.core.binary import BinaryPatchModifier, BinaryPatchConfig\n",
    "\n",
    "\n",
    "async def chase_tail(ofrak_context, input_filename, output_filename):\n",
    "    # In a real script, we would run the two lines below... But let's be lazy and reuse\n",
    "    # the already unpacked root_resource that we defined in the global scope.\n",
    "    # root_resource = await ofrak_context.create_root_resource_from_file(input_filename)\n",
    "    # await root_resource.unpack_recursively()\n",
    "    main_cb = await get_main_complex_block(root_resource)\n",
    "    ret_instruction = await get_complex_block_ret_instruction(main_cb)\n",
    "    program_attributes = await root_resource.analyze(ProgramAttributes)\n",
    "    looping_instruction = await get_looping_instruction(\n",
    "        main_cb, ret_instruction, program_attributes\n",
    "    )\n",
    "\n",
    "    range_in_root = await ret_instruction.resource.get_data_range_within_root()\n",
    "    await root_resource.run(\n",
    "        BinaryPatchModifier,\n",
    "        BinaryPatchConfig(\n",
    "            offset=range_in_root.start,\n",
    "            patch_bytes=looping_instruction,\n",
    "        ),\n",
    "    )\n",
    "\n",
    "    await root_resource.pack()\n",
    "    await root_resource.flush_data_to_disk(output_filename)\n",
    "\n",
    "\n",
    "await chase_tail(binary_analysis_context, \"hello_world\", \"hello_world_forever\")\n",
    "stdout = subprocess.run(\n",
    "    \"chmod +x hello_world_forever && timeout 1s ./hello_world_forever\",\n",
    "    shell=True,\n",
    "    stdout=subprocess.PIPE,\n",
    ").stdout.decode(\"utf-8\")\n",
    "print(stdout[0:70] + \"...\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f82da20",
   "metadata": {},
   "source": [
    "Someone is chasing its tail and never catching it 😹"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d25ddcf",
   "metadata": {},
   "source": [
    "[Next page](5_filesystem_modification.ipynb)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.18"
  },
  "pycharm": {
   "stem_cell": {
    "cell_type": "raw",
    "metadata": {
     "collapsed": false
    },
    "source": []
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
